Principal component analysis-based filtering improves detection for Affymetrix gene expression arrays
نویسندگان
چکیده
Gene expression array technology has reached the stage of being routinely used to study clinical samples in search of diagnostic and prognostic biomarkers. Due to the nature of array experiments, which examine the expression of tens of thousands of genes simultaneously, the number of null hypotheses is large. Hence, multiple testing correction is often necessary to control the number of false positives. However, multiple testing correction can lead to low statistical power in detecting genes that are truly differentially expressed. Filtering out non-informative genes allows for reduction in the number of null hypotheses. While several filtering methods have been suggested, the appropriate way to perform filtering is still debatable. We propose a new filtering strategy for Affymetrix GeneChips®, based on principal component analysis of probe-level gene expression data. Using a wholly defined spike-in data set and one from a diabetes study, we show that filtering by the proportion of variation accounted for by the first principal component (PVAC) provides increased sensitivity in detecting truly differentially expressed genes while controlling false discoveries. We demonstrate that PVAC exhibits equal or better performance than several widely used filtering methods. Furthermore, a data-driven approach that guides the selection of the filtering threshold value is also proposed.
منابع مشابه
Gene filtering by PCA for Affymetrix GeneChips
Due to the nature of the array experiments which examine the expression of tens of thousands of genes (or probesets) simultaneously, the number of null hypotheses to be tested is large. Hence multiple testing correction is often necessary to control the number of false positives. However, multiple testing correction can lead to low statistical power in detecting genes that are truly differentia...
متن کاملProbe Selection and Expression Index Computation of Affymetrix Exon Arrays
BACKGROUND There is great current interest in developing microarray platforms for measuring mRNA abundance at both gene level and exon level. The Affymetrix Exon Array is a new high-density gene expression microarray platform, with over six million probes targeting all annotated and predicted exons in a genome. An important question for the analysis of exon array data is how to compute overall ...
متن کاملNormalization of single-channel DNA array data by principal component analysis
MOTIVATION Detailed comparison and analysis of the output of DNA gene expression arrays from multiple samples require global normalization of the measured individual gene intensities from the different hybridizations. This is needed for accounting for variations in array preparation and sample hybridization conditions. RESULTS Here, we present a simple, robust and accurate procedure for the g...
متن کاملMapping of trans-acting regulatory factors from microarray data
To explore the mapping of factors regulating gene expression, we have carried out linkage studies using expression data from individual transcripts (from Affymetrix microarrays; Genetic Analysis Workshop 15 Problem 1) and composite data on correlated groups of transcripts. Quality measures for the arrays were used to remove outliers, and arrays with sex mismatches were also removed. Data likely...
متن کاملPackage ‘ xps ’
Description The package handles pre-processing, normalization, filtering and analysis of Affymetrix GeneChip expression arrays, including exon arrays (Exon 1.0 ST: core, extended, full probesets), gene arrays (Gene 1.0 ST) and plate arrays on computers with 1 GB RAM only. It is an R wrapper to XPS (eXpression Profiling System), which is based on ROOT, an object-oriented framework developed at C...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره 39 شماره
صفحات -
تاریخ انتشار 2011